Abstract

We motivate and propose a new way of thinking about failure detectors which allows us 
to define, quite surprisingly, what it means to solve a distributed task wait-free using a failure 
detector. In our model, the system is composed of computation processes that obtain inputs 
and are supposed to output in a finite number of steps and synchronization processes that are 
subject to failures and can query a failure detector. We assume that, under the condition that 
correct synchronization processes take sufficiently many steps, they provide the computation 
processes with enough advice to solve the given task wait-free: every computation process 
outputs in a finite number of its own steps, regardless of the behavior of other computation 
processes. Every task can thus be characterized by the weakest failure detector that allows 
for solving it, and we show that every such failure detector captures a form of set agreement. 
We then obtain a complete classification of tasks, including ones that evaded comprehensible 
characterization so far, such as renaming or weak symmetry breaking. 



1 Introduction 



What does it mean to solve a task? A distributed task for a set of processes can be seen 
as a function that maps an input vector to an output vector, one value per process. It is easy to 
reason about correctness of a task solution by matching the outputs to the inputs with respect 
to the task specification. When it comes to progress, however, matters are less trivial.

On the surface, it is desirable to expect that the input vector is exactly matched by the 
output vector, i.e., every participating process obtains an output.¹ Unfortunately, in asynchronous
or partially synchronous systems where relative process speeds are unbounded or very large,
ensuring this property would require very long waiting. A more natural wait-freedom property
requires that any participating process that takes sufficiently many steps obtains an output,
"regardless of execution speeds of other processes" [20]. A wait-free task solution thus allows for
treating the requirement "a given participant outputs" as a liveness property [2] : every execution 
has an extension in which the requirement is met. Naturally, wait-freedom assumes no notion of 
process failures: a process that does not take steps for a while in a given execution, always has a 
chance to wake up and take enough steps to output. 

Failure detectors. Unfortunately, very few tasks can be solved wait-free in the basic read-write 
shared-memory model [25, 21, 27]. The failure detector abstraction [10] was proposed
to circumvent these impossibilities. Intuitively, a failure detector provides each process with some 
(possibly incomplete and inaccurate) information about the current failure pattern, e.g., a list of 
processes predicted to take only finitely many steps in the current execution. The failure detector 
abstraction gives a language for capturing the weakest support from the system one may require 
in order to solve a given task. This gave many interesting insights on the nature of "wait-free 
unsolvable" tasks, starting from the celebrated result by Chandra et al. on the weakest failure 
detector for consensus [9].²

A solution of the task using a failure detector guarantees that every correct process (one that the
failure pattern predicts to take infinitely many steps) eventually obtains an output. The
progress of each process may thus depend on the behavior of other correct processes, and therefore
a failure-detector-based algorithm cannot be wait-free. Consequently, since the failure pattern is
introduced as a part of a run, we cannot treat individual progress as a liveness property anymore: 
a process is not allowed to take steps after it crashes. 

Wait-freedom with advice. But can we think of a system where a "hard" task can be solved 
so that progress of a process does not depend on the execution speeds of other processes? A 
straightforward way to achieve this is to assume that the processes receive advice from an external 
oracle, and an immediate question is what is the weakest oracle that allows for solving a given 
task so that every participating process taking enough steps outputs. 

In this paper, we use the language of failure detectors to determine the relative power of such 
external oracles. The oracle is represented as a set of synchronization processes equipped with a 
failure detector: each synchronization process can query its failure detector module to get hints 
about the failures of other synchronization processes. Thus, our system only considers failures 
of synchronization processes. As in the classical failure-detector literature [9], the assumptions 
about when and where failures of synchronization processes can occur are encapsulated in an 
environment, i.e., a set of allowed failure patterns. Computation processes (participants in a 
task solution) and synchronization processes communicate by reading and writing in the shared 
memory. 

1 A process is considered participating if it takes at least one step in the computation.

2 Informally, D is the weakest failure detector to solve a task T if it (1) solves T and (2) can be deduced from
any failure detector that solves T. 
Now what do we mean by solving a task with a failure detector? We require that, under the
condition that the synchronization processes using their failure detector behave as predicted by 
the environment, every computation process taking enough steps must output. 

It is easy to see that the classical failure-detector model [9] is a special case of our model where 
there is a bijective map between computation and synchronization processes, and a computation 
process stops taking steps after its synchronization counterpart does. Strictly speaking, when it 
comes to solving tasks, our framework demands from a failure detector more than the conventional 
failure detector model does. Indeed, in our framework, the failure of a synchronization process 
does not affect computation processes, and a failure detector is supposed to help computation 
processes output, as long as they take enough steps. In particular, we observe that the weakest 
failure detector to solve a task T in our framework is at least as strong as the weakest failure 
detector for T in the conventional model [9]. 

Ramifications. The idea of separating computation from synchronization is not new, e.g., it is 
used in the celebrated Paxos protocol [21] separating proposers from acceptors and learners. But 
applying it to distributed computing with failure detectors results in a surprisingly simple model, 
which we call external failure detection (EFD), which resolves a number of long-standing puzzles. 

The use of EFD enables a complete characterization of distributed tasks, based on the "amount 
of concurrency" they can stand. In the classical framework, we say that a task T can be solved 
k-concurrently if it guarantees that in every k-concurrent run every process taking sufficiently
many steps eventually outputs [16]. Informally, a run is k-concurrent if at each moment of time,
there are at most k participating processes without outputs. Now, in a system of n processes,
each task T is associated with the largest k (1 ≤ k ≤ n) such that T can be solved k-concurrently.

We show that in EFD, a failure detector D can be used to solve a task T with "concurrency
level" at most k if and only if D can be used to solve k-set agreement. More precisely, we show
that, in every environment, i.e., for all assumptions on where and when failures of synchronization
processes may occur, any failure detector that solves T is at least as strong as the anti-Ω-k failure
detector [28], denoted ¬Ω_k. Then we describe an algorithm that uses ¬Ω_k to solve T (or any
task that can be solved k-concurrently), in every environment.

Thus, any task is completely characterized through the "level of concurrency" its solution can 
tolerate. All tasks that can be solved k-concurrently but not (k + 1)-concurrently (e.g., k-set
agreement) are equivalent in the sense that they require exactly the same amount of information
about failures (captured by ¬Ω_k) to be solved in EFD. Note that this characterization covers all
tasks, including "colored" ones that evaded any characterization so far [13].

Consider, for example, the task of (j, ℓ)-renaming in which j processes come from a large
set of potential participants and choose new names in a smaller name space 1, ..., ℓ, so that no
two processes choose the same name. Surprisingly, in the conventional model, the renaming task
itself can be formulated as a failure detector, so the question of the weakest failure detector for
solving it results in a triviality. To avoid trivialities, additional assumptions on the scope of failure
detectors are made.

In EFD, however, we immediately see that (j, j)-renaming (also called strong renaming) cannot
be solved 2-concurrently and is thus equivalent to consensus.³ More generally, determining the
weakest failure detector for (j, ℓ)-renaming boils down to determining the maximal k (1 ≤ k ≤ j)
such that the task can be solved k-concurrently. We show finally that (j, j + k - 1)-renaming can
be solved k-concurrently, and, thus, using ¬Ω_k.⁴
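
As a minimal sketch (not one of the paper's constructions), the safety condition of (j, ℓ)-renaming can be written as a predicate over a vector of chosen names. The function name `is_valid_renaming` and the use of `None` for a process without an output are illustrative assumptions.

```python
from typing import Optional, Sequence


def is_valid_renaming(names: Sequence[Optional[int]], j: int, l: int) -> bool:
    """Check an output vector of (j, l)-renaming.

    names[i] is the new name chosen by process i, or None if it has no output.
    Valid means: at most j processes chose a name, every name lies in 1..l,
    and no two processes chose the same name.
    """
    chosen = [x for x in names if x is not None]
    if len(chosen) > j:
        return False
    if any(not (1 <= x <= l) for x in chosen):
        return False
    return len(set(chosen)) == len(chosen)
```

With j = l the predicate describes strong renaming: j participants must fill a name space of exactly j names without collisions.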

Another interesting corollary of our characterization is that if a failure detector solves k-set 
agreement among an arbitrary given subset of k + 1 processes, then it is strong enough to solve 
k-set agreement among all processes. This is a generalization of the recent result of Delporte et
al. [12] that any failure detector allowing for solving consensus (1-set agreement) among each two
processes also allows for solving consensus among all processes. Years of trying to show that
the phenomenon demonstrated in [12] generalizes to all k > 1 in the conventional failure-detector
model [9] bore no fruit.

3 Note that all tasks can be solved 1-concurrently.

4 For some values of j and k, however, the question of the maximal tolerated concurrency of (j, j + k - 1)-renaming
is still open [5].

One important feature inherited by our EFD framework from wait-free protocols is that it 
leverages simulation-based computing: processes can cooperate trying to bring all participating
processes to their outputs. Simulations were instrumental in establishing tight relations between 
seemingly different phenomena in asynchronous systems, and we extend
this line of research below to failure-detector models. 

Roadmap. The paper is organized as follows. First, we formally define our model and our new 
notion of task solvability with a failure detector. We then present a simple inductive proof of a 
generalization of [12] to any k > 1. Then we extend the generalization even further by presenting 
a complete characterization of decision tasks, based on the level of concurrency they can tolerate. 
Then we derive the weakest failure detector for strong renaming and wrap up with obligatory 
concluding remarks. Proofs are partially delegated to the optional Appendix. 

2 The model of external failure detection 

In this section, we propose a new definition of what it means to solve a task using a failure 
detector and relate it to the conventional definition of [5]. Parts of our model reuse elements
of earlier models.

2.1 Model for computation and synchronization 

Our system is split in two parts. The computation part is made up of processes that get input 
values for the task they intend to solve and return output values. The synchronization part is 
made up of processes that use failure detectors to help processes of the computation part. 

Processes. Formally, we consider a read-write shared-memory system which consists of m
C-processes, Π^C = {p_1, ..., p_m}, and n S-processes, Π^S = {q_1, ..., q_n}. We allow n and m to be
arbitrary natural numbers, but, as we shall see shortly, the only "interesting" case is when n = m.

Intuitively, the C-processes are responsible for computation. The S-processes are responsible 
for synchronization and may be equipped with a failure detector module [10] that gives hints 
about failures of other S-processes. The processes in Π^C ∪ Π^S communicate with each other via
reading and writing in the shared memory. 

Failure patterns and failure detectors. Since C-processes are assumed to be wait-free, we are only 
interested here in failures of S-processes. Hence a failure pattern F is a function from the time
range 𝕋 = ℕ to 2^(Π^S), where F(τ) denotes the set of S-processes that have crashed by time τ.
Once a process crashes, it does not recover, i.e., ∀τ : F(τ) ⊆ F(τ + 1). The set
faulty(F) = ∪_{τ ∈ 𝕋} F(τ) is the set of faulty processes in F, and correct(F) = Π^S - faulty(F)
is the set of correct processes in F.

A failure detector history H with range ℛ is a function from Π^S × 𝕋 to ℛ. H(q_i, τ) is interpreted
as the value output by the failure detector module of S-process q_i at time τ. A failure detector D
with range ℛ_D is a function that maps each failure pattern to a (non-empty) set of failure detector
histories with range ℛ_D. D(F) denotes the set of possible failure detector histories permitted by
D for failure pattern F.

An environment E is a set of failure patterns that describes a set of conditions on when and
where failures might occur. For example, E_t is the environment that consists of all failure patterns
F such that |correct(F)| ≥ n - t. We assume that for every failure pattern in the environments we
consider, at least one S-process is correct.
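
The definitions above can be sketched concretely. The following is a minimal illustration, assuming a failure pattern is represented by the crash time of each faulty S-process (a representation chosen here for convenience, not taken from the paper); the names `FailurePattern`, `crashed_by` and `in_environment_t` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass
class FailurePattern:
    processes: FrozenSet[str]   # the set of S-processes
    crash_time: Dict[str, int]  # crash time of each faulty S-process

    def crashed_by(self, t: int) -> FrozenSet[str]:
        """F(t): the S-processes that have crashed by time t."""
        return frozenset(q for q, c in self.crash_time.items() if c <= t)

    def faulty(self) -> FrozenSet[str]:
        """Union of F(t) over all times t."""
        return frozenset(self.crash_time)

    def correct(self) -> FrozenSet[str]:
        return self.processes - self.faulty()


def in_environment_t(F: FailurePattern, t: int) -> bool:
    """Membership in E_t: at least n - t correct S-processes,
    and (as assumed throughout) at least one correct S-process."""
    n = len(F.processes)
    return len(F.correct()) >= max(n - t, 1)
```

Because crashed processes are derived from fixed crash times, `crashed_by` is monotone in t, which matches the no-recovery condition F(τ) ⊆ F(τ + 1).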
Algorithms and runs. A distributed algorithm A using a failure detector D consists of two collections
of deterministic automata, A^C_1, ..., A^C_m, one automaton for each C-process, and A^S_1, ..., A^S_n,
one automaton for each S-process. In a step of the algorithm, a process may read or write to a
shared register, or (if it is an S-process) consult its failure-detector module.

A state of A is defined as the state of each process (state of each process being identified with 
the state of its corresponding automaton) and each shared object in the system. An initial state 
I of A specifies an initial state for every process and every shared object. 

A run of A using a failure detector D in an environment E is a tuple R = (F, H, I, Sch, T)
where F ∈ E is a failure pattern, H ∈ D(F) is a failure detector history, I is an initial state
of A, Sch is an infinite schedule, i.e., a sequence of processes in Π^C ∪ Π^S, and T is a
non-decreasing sequence of elements of 𝕋. The k-th step of run R is a step of process Sch[k] determined by
the current state, the failure detector history H, the time T[k] and the algorithm A. If it is a step of an S-process,
this process is alive (Sch[k] ∉ F(T[k])) and the value of the failure detector for this step is given
by H(Sch[k], T[k]).

Let inf^S(R) denote the set of processes in Π^S that appear infinitely often in Sch. Respectively,
let inf^C(R) denote the set of processes in Π^C that appear infinitely often in Sch. We say that a run
R = (F, H, I, Sch, T) is fair if correct(F) is equal to inf^S(R), and inf^C(R) is not empty. A finite
run of A is a "prefix" of a run (F, H, I, Sch, T) of A, i.e., a tuple (F, H, I, Sch', T') such that
|Sch'| = |T'|, Sch' is a proper prefix of Sch, and T' is a proper prefix of T.

Tasks. We focus on a class of problems called tasks that are defined uniquely through inputs and 
outputs. 

A task [21] is defined through a set ℐ of input vectors (one input value for each C-process), a
set 𝒪 of output vectors (one output value for each C-process), and a total relation Δ : ℐ → 2^𝒪
that associates each input vector with a set of possible output vectors. An input value equal to
⊥ denotes a non-participating process and a ⊥ output value denotes an undecided process.

An m-vector L' is a prefix of an m-vector L if L' contains at least one non-⊥ item and for all
i, 1 ≤ i ≤ m, either L'[i] = ⊥ or L'[i] = L[i]. A set ℒ of vectors is prefix-closed if for all L in ℒ
every prefix of L is in ℒ.
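
The prefix relation and prefix-closure can be illustrated with a short sketch. It assumes vectors are Python tuples and that `None` stands for ⊥; the function names are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Iterator, Optional, Tuple

Vector = Tuple[Optional[int], ...]  # None plays the role of ⊥


def is_prefix(lp: Vector, l: Vector) -> bool:
    """lp is a prefix of l: same length, at least one non-⊥ item,
    and every item of lp is either ⊥ or equal to the item of l."""
    if len(lp) != len(l) or all(x is None for x in lp):
        return False
    return all(a is None or a == b for a, b in zip(lp, l))


def prefixes(l: Vector) -> Iterator[Vector]:
    """Enumerate all prefixes of l (including l itself if it has a non-⊥ item)."""
    idx = [i for i, x in enumerate(l) if x is not None]
    for r in range(1, len(idx) + 1):
        for keep in combinations(idx, r):
            yield tuple(x if i in keep else None for i, x in enumerate(l))


def is_prefix_closed(vectors: Iterable[Vector]) -> bool:
    vs = set(vectors)
    return all(p in vs for v in vs for p in prefixes(v))
```

For example, the set {(1, 2), (1, ⊥), (⊥, 2)} is prefix-closed, while {(1, 2), (1, ⊥)} is not, since the prefix (⊥, 2) is missing.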

We assume that each element of ℐ and 𝒪 contains at least one non-⊥ item and also that the
sets ℐ and 𝒪 are prefix-closed. Moreover, we only consider tasks that have finite sets of input
vectors ℐ (this assumption is used in Section 4 when we categorize tasks based on the failure
detectors needed to solve them).

We stipulate that if (I, O) ∈ Δ, then (1) if, for some i, I[i] = ⊥, then O[i] = ⊥, (2) for each
O', prefix of O, (I, O') ∈ Δ, and (3) for each I' such that I is a prefix of I', there exists some O'
such that O is a prefix of O' and (I', O') ∈ Δ.

For example, in the task of (U, k)-agreement, where U ⊆ Π^C, input and output vectors are
m-vectors such that I[i] = ⊥ for all p_i ∉ U, input values are in {⊥, 0, ..., k}, output values are
in {⊥, 0, ..., k}, and for each input vector I and output vector O, (I, O) ∈ Δ if the set of non-⊥
values in O is a subset of values in I of size at most k. (Π^C, k)-agreement is the conventional
k-set agreement task [11] and (Π^C, 1)-agreement is consensus [14].
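
As a hedged illustration, membership of a pair (I, O) in the relation Δ of (U, k)-agreement can be checked as follows. The sketch assumes 0-based process indices, `None` for ⊥, and a hypothetical function name `in_agreement_relation`; it also enforces the general stipulation that a non-participating process has no output.

```python
from typing import Optional, Sequence, Set


def in_agreement_relation(I: Sequence[Optional[int]],
                          O: Sequence[Optional[int]],
                          U: Set[int], k: int) -> bool:
    """Return True if (I, O) belongs to the relation of (U, k)-agreement."""
    if len(I) != len(O):
        return False
    # Only processes in U may participate.
    if any(I[i] is not None for i in range(len(I)) if i not in U):
        return False
    # A non-participating process does not decide.
    if any(I[i] is None and O[i] is not None for i in range(len(I))):
        return False
    proposed = {v for v in I if v is not None}
    decided = {v for v in O if v is not None}
    # Decided values are proposed values, and there are at most k of them.
    return decided <= proposed and len(decided) <= k
```

Taking U to be all indices gives the conventional k-set agreement check, and k = 1 gives consensus.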

2.2 Solving a task in the EFD framework 

Now we are ready to define what it means to solve a task in the external failure detection
framework. 

Input vector and output vector of a run. First, we assume that each automaton A^C_i (1) gets an
input value input_i as part of its initial state, and (2) contains decide steps, each associated with
a decision value v_i, such that all steps of A^C_i following a decide step are null steps that do not
affect the current state when they are executed.

The first step of each C-process is to write its input value to shared memory. A process that 
wrote its input value is called participating. If a C-process executes a decide step with decision 
value v, we say that the process decides v or returns v.

Given a run R, the input vector of the run is the m-vector I such that I[i] = input_i if p_i is
a participating process and I[i] = ⊥ if p_i is a non-participating process. In the same way, the
output vector of the run is the m-vector O such that O[i] = v if p_i decides v in the run and
O[i] = ⊥ if p_i does not decide in the run.

Solving a task. We say that a run R with input vector I and output vector O satisfies a task
T = (ℐ, 𝒪, Δ) if (1) (I, O) ∈ Δ and (2) O[i] = ⊥ only if p_i makes a finite number of steps
(p_i ∉ inf^C(R)).

An algorithm A EFD-solves a task T = (ℐ, 𝒪, Δ) using a failure detector D in an environment
E (in the rest we simply say "solves") if every fair run of A satisfies T. If such an algorithm exists
for task T, T is solvable with failure detector D in environment E. By extension, a failure detector
D solves a task T in E if there is an algorithm A that solves T using D in E.

Note that we expect the algorithm to guarantee output to every C-process that takes sufficiently
many steps, regardless of where and when S-processes fail. The algorithm only expects
that every correct S-process in the current failure pattern takes infinitely many steps. 

Comparing failure detectors. Failure detector reduction is defined as usual: failure detector D'
is weaker than failure detector D in an environment E if S-processes can use D to emulate D'
in E. More precisely, the automata of the C-processes of the distributed reduction algorithm
A are automata with only null steps, and the emulation of D' using D is made by maintaining,
at each S-process q_i, a variable D'-output_i so that in any fair run with failure pattern F, the evolution of
the variables {D'-output_i} for q_i ∈ Π^S results in a history H' ∈ D'(F). We say that two failure detectors
are equivalent in E if each is weaker than the other in E.

As in the original definition [9], if failure detector D' is weaker than failure detector D in
environment E, then every task solvable with D' in E can also be solved with D in E. Now D is
the weakest failure detector to solve a task T in E if (i) D solves T in E and (ii) D is weaker than
any failure detector that solves T in E. It is straightforward to extend the arguments of [22] to
show that every task has a weakest failure detector.

k-concurrency. Consider the solvability of a task without the help of a failure detector. In this
case the deterministic automata of the S-processes of the distributed algorithm A are automata
with only null steps. Such an algorithm will be called restricted. 

It is clear that tasks that are solvable with a restricted algorithm are exactly the tasks that are
said to be wait-free solvable in the literature (e.g., in [20, 21]).

The notion of k-concurrent solvability, introduced in [16], is a weaker form of solvability: a
task is solvable k-concurrently if it can be solved whenever at most k C-processes concurrently
invoke the task. More precisely, a run of a distributed algorithm is k-concurrent if it is fair and
at each time there are at most k undecided participating C-processes. A task T = (ℐ, 𝒪, Δ) is
k-concurrently solvable if there is a restricted algorithm A such that all k-concurrent runs R of A
satisfy T. Note that runs of A in which the number of participating but not decided C-processes
exceeds k at some point may not satisfy T.
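
The concurrency condition can be sketched as a check over a finite trace of events. This is only an illustration under stated assumptions: a run is abstracted to a list of ("start", p) and ("decide", p) events, where "start" marks the step in which a C-process becomes participating, and fairness is not modeled. The names `max_concurrency` and `is_k_concurrent` are hypothetical.

```python
from typing import Iterable, Tuple

Event = Tuple[str, str]  # (kind, process), kind is "start" or "decide"


def max_concurrency(events: Iterable[Event]) -> int:
    """Largest number of participating, undecided C-processes at any point."""
    undecided = set()
    peak = 0
    for kind, p in events:
        if kind == "start":
            undecided.add(p)
        elif kind == "decide":
            undecided.discard(p)
        else:
            raise ValueError(f"unknown event kind: {kind}")
        peak = max(peak, len(undecided))
    return peak


def is_k_concurrent(events: Iterable[Event], k: int) -> bool:
    """True if the trace never has more than k undecided participants."""
    return max_concurrency(events) <= k
```

A trace in which processes run one after another has concurrency 1, which is the regime in which every task is solvable.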

A wait-free solvable task is m-concurrently solvable. Also, it is easy to show that:

Proposition 1 Every task is 1-concurrently solvable.

Restriction on the number of C-processes. Trivially, if a task T is solvable with a restricted
algorithm then T is also solvable with any number of S-processes and any failure detector. Reciprocally,
consider an algorithm A solving a task T with a trivial failure detector⁵ in environment
E_{n-1}. If n ≥ m, consider the following algorithm: each C-process p_i alternates between steps of
A^C_i and steps of A^S_i, and each S-process executes only null steps. It is easy to verify that in this
way we emulate runs of A in the failure pattern in which at least all S-processes q_i with i > m
are crashed, and such runs satisfy task T. Hence we get:

5 A trivial failure detector always outputs ⊥.

Proposition 2 If n ≥ m, T is solvable in E_{n-1} with a trivial failure detector if and only if T is
solvable with a restricted algorithm.

But if n < m, the S-processes may help solving the task even if they do not use their failure
detection capacities. For example, with n S-processes we can implement (Π^C, n)-set agreement
in every environment. For this, each S-process waits until at least one C-process writes its input
in shared memory, and then it writes this value to a shared variable V. Each C-process waits until
V has been written and outputs the read value. As at least one S-process is correct, eventually
V will be written, and as there are n S-processes, at most n values may be output. In this way
(Π^C, n)-set agreement is always solvable even without the help of any failure detector.
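
The protocol just described can be sketched as a sequential simulation. This is a toy model under several assumptions that are not in the paper: steps are interleaved by a seeded random scheduler, every C-process participates with a non-`None` input, crashed S-processes take no steps at all, and an S-process reads the inputs and writes V in a single step. The function name `simulate` and its parameters are hypothetical.

```python
import random
from typing import FrozenSet, List


def simulate(inputs: List[int], n_s: int,
             crashed: FrozenSet[int] = frozenset(), seed: int = 0) -> List[int]:
    """Run the protocol until every C-process has an output; return the outputs."""
    rng = random.Random(seed)
    m = len(inputs)
    written = [None] * m          # shared registers holding the written inputs
    V = None                      # the shared variable V
    s_done = set()                # S-processes that already wrote V
    outputs = [None] * m
    live_s = [j for j in range(n_s) if j not in crashed]
    assert live_s, "at least one S-process must be correct"
    while any(o is None for o in outputs):
        if rng.choice(["C", "S"]) == "C":
            i = rng.randrange(m)
            if written[i] is None:
                written[i] = inputs[i]        # first step: write the input
            elif outputs[i] is None and V is not None:
                outputs[i] = V                # output the value read from V
        else:
            j = rng.choice(live_s)
            if j not in s_done:
                seen = [v for v in written if v is not None]
                if seen:                      # wait until some input is written
                    V = rng.choice(seen)
                    s_done.add(j)             # each S-process writes V once
    return outputs
```

Since each S-process writes V at most once, V takes at most n distinct values over a run, which bounds the number of distinct outputs by n.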

As we focus here on solvability where additional power of processes is only due to the failure 
detection, the only "interesting" scenario to consider is when the number of C-processes does not 
exceed the number of S-processes and more specifically the case where they are equal. Therefore, 
in the following we assume that the number of C-processes is equal to the number of S-processes,
and we denote this number by n.

2.3 Conventional solvability 

More conventional models of computation in which there is no separation between the computation
and the synchronization part may be considered as a special case of the generalized model
presented here. In conventional models, each process i ∈ {1, ..., n} can be seen as running two
parallel threads: p_i corresponding to the computational part and q_i corresponding to the
synchronization part. Moreover, failure patterns correspond: i is correct in conventional systems if
and only if q_i is correct in our setting. But, since in our model computation and synchronization
are separate, it is possible that p_i makes only a finite number of steps even if q_i is correct, or
vice versa. We therefore define personified runs of a distributed algorithm as runs R that are
fair and such that p_i crashes if and only if q_i crashes at the same time (as a result, inf^C(R) is
equal to inf^S(R)). We say that algorithm A classically solves task T with failure detector D in
environment E if every personified run R of A satisfies T.

This definition corresponds exactly to the notion of solvability in a conventional setting as 
can be found in the literature [9]. 

As the set of personified runs of a distributed algorithm is a subset of the fair runs, we have: 

Proposition 3 If a failure detector D solves a task T in an environment E then D classically
solves T in E.

Corollary 4 If D is the weakest failure detector to classically solve a task T in an environment
E, then D is weaker than the weakest failure detector to solve T in E.

Note that the converse of Proposition 3 is not true. For example, consider the
({p_1, p_2}, 1)-agreement task (consensus among p_1 and p_2). It is classically solvable in E_2 (assuming at most
2 failures) with the failure detector D that, for each S-process, outputs q_1 if q_1 is correct and
outputs q_2 if q_1 is faulty. But this task is not solvable in E_2 with this failure detector (intuitively,
otherwise, if q_1 is crashed we would be able to solve consensus between p_1 and p_2 without a failure
detector).

However, for colorless tasks⁶ both notions of solvability coincide.

6 Informally, in a solution of a colorless task [7], a process is free to adopt the input or the output value of any 
other participating process. 
Proposition 5 Let T be a colorless task. T is solvable with failure detector D in environment E
if and only if T is classically solvable with D in E. The weakest failure detector to solve T in E
is the weakest failure detector to classically solve T in E.

Failure detectors for k-set agreement. 

The failure detector ¬Ω_k [28] outputs, at every S-process and each time, a set of (n - k)
S-processes. ¬Ω_k guarantees that there is a time after which some correct S-process is never
output:

∀F ∈ E, ∀H ∈ ¬Ω_k(F), ∃q_i ∈ correct(F), ∃τ ∈ 𝕋,
∀τ' > τ, ∀q_j ∈ correct(F) : q_i ∉ H(q_j, τ').

¬Ω_1 is equivalent to Ω [9], which outputs an S-process such that eventually the same correct
S-process is permanently output at all correct processes.
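
The ¬Ω_k property is an eventual one, so it cannot be decided from a finite observation. The sketch below checks only a finite-horizon approximation: given a candidate stabilization time `tau` and a `horizon`, it looks for a correct S-process that no correct S-process outputs at any time in (tau, horizon]. The history is modeled as a Python function from (process, time) to a set; all names here are illustrative assumptions.

```python
from typing import Callable, FrozenSet, Iterable, Set

History = Callable[[str, int], FrozenSet[str]]


def never_output_after(history: History, correct: Iterable[str],
                       horizon: int, tau: int) -> Set[str]:
    """Correct processes not output by any correct process in (tau, horizon]."""
    correct = set(correct)
    candidates = set(correct)
    for q in correct:
        for t in range(tau + 1, horizon + 1):
            candidates -= history(q, t)
    return candidates


def satisfies_anti_omega_k_prefix(history: History, processes: Iterable[str],
                                  correct: Iterable[str], k: int,
                                  horizon: int, tau: int) -> bool:
    """Finite-horizon check of the range and of the 'never output' condition."""
    processes, correct = set(processes), set(correct)
    n = len(processes)
    for q in correct:
        for t in range(horizon + 1):
            out = history(q, t)
            if len(out) != n - k or not out <= processes:
                return False  # each output must be a set of n - k S-processes
    return bool(never_output_after(history, correct, horizon, tau))


# Example history with n = 3, k = 1: q1 stops being output from time 3 on.
def H(q: str, t: int) -> FrozenSet[str]:
    return frozenset({"q1", "q2"}) if t < 3 else frozenset({"q2", "q3"})
```

With processes {q1, q2, q3} and correct set {q1, q2}, the history `H` passes the check for tau = 2 (q1 is never output afterwards) but not for tau = 0.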

From [18], we know that in every environment E, ¬Ω_k is the weakest failure detector to
classically solve (Π^C, k)-set agreement in E. As (Π^C, k)-set agreement is a colorless task, from
Proposition 5 we obtain:

Proposition 6 In every environment E, ¬Ω_k is the weakest failure detector to solve (Π^C, k)-set
agreement in E.

3 Solving a puzzle 

Let U be a set of k + 1 C-processes. Consider a failure detector D that solves k-set agreement
among the processes in U. We show that D actually solves k-set agreement among all n
C-processes.

Theorem 7 Let U be a set of (k + 1) C-processes, for some 1 ≤ k < n. For every environment
E, if a failure detector D solves (U, k)-set agreement in E then D solves (Π^C, k)-set agreement
in E.

Proof sketch. Without loss of generality, assume that U = {p_1, ..., p_{k+1}}. Let A be a distributed
algorithm that solves (U, k)-set agreement in E with D.

Let U_x denote {p_1, ..., p_x}, x = k + 1, ..., n. We observe first that D can be used to solve
(U_x, x - 1)-set agreement as follows. C-processes in {p_1, ..., p_{k+1}} and S-processes {q_1, ..., q_n}
run A to solve k-set agreement and return the value returned by the algorithm, and processes in
{p_{k+2}, ..., p_x} simply return their own input values. The first group returns at most k distinct
values and the second at most x - k - 1, so in total at most x - 1 distinct input values
are returned. Let A_x denote the resulting algorithm.
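
The counting argument for A_x can be checked by brute force on small instances. The sketch below is only an illustration: it does not implement A, but enumerates every output assignment that a correct k-set agreement among the first k + 1 processes could produce, assuming all x processes participate. The function names are hypothetical.

```python
from itertools import product
from typing import Iterable, List, Sequence


def a_x_outputs(inputs: Sequence[int], k: int,
                agreement_outputs: Sequence[int]) -> List[int]:
    """Outputs of A_x: the first k + 1 processes return what the k-set
    agreement returned, the remaining ones return their own inputs."""
    return list(agreement_outputs) + list(inputs[k + 1:])


def valid_k_agreement(inputs_u: Sequence[int], outputs_u: Sequence[int], k: int) -> bool:
    """Outputs are proposed values and at most k of them are distinct."""
    return set(outputs_u) <= set(inputs_u) and len(set(outputs_u)) <= k


def check_bound(x: int, k: int, values: Iterable[int]) -> bool:
    """True if every possible execution returns at most x - 1 distinct inputs."""
    values = list(values)
    for inputs in product(values, repeat=x):
        u_in = inputs[:k + 1]
        for u_out in product(set(u_in), repeat=k + 1):
            if not valid_k_agreement(u_in, u_out, k):
                continue
            out = a_x_outputs(inputs, k, u_out)
            if len(set(out)) > x - 1 or not set(out) <= set(inputs):
                return False
    return True
```

For instance, `check_bound(4, 2, range(4))` explores all input vectors of four processes over four values and confirms the x - 1 bound for k = 2.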

We proceed now by downward induction to show that for all x = n down to k, D solves
(Π^C, x)-set agreement.

The base case is immediate: {p_1, ..., p_n} trivially solve (Π^C, n)-set agreement without any
failure detector. Now suppose that D solves (Π^C, x)-set agreement for x ≥ k + 1. By Proposition 6,
D can be used to implement ¬Ω_x.

Using the generic simulation technique presented in Appendix C.1, the C-processes p_1, ..., p_n
can use ¬Ω_x to simulate a run of the C-part of A_x on p_1, ..., p_x, so that at least one simulated
process takes infinitely many steps.⁷ The S-part of A_x is executed by S-processes. In the
simulation, each simulating process proposes its input value as an input value in the first step for
each simulated process in {p_1, ..., p_x} (this can be done, since (Π^C, x)-set agreement is a colorless
task).

Suppose that the current run is fair, i.e., every correct S-process takes infinitely many steps.
Therefore, we simulate a fair run of A_x and thus eventually some simulated C-process in {p_1, ..., p_x}
decides on one of the input values of the C-processes. Once a simulator finds out that a simulated
process decided, it returns the decided value. Thus, eventually, every correct simulator returns.
Since all decided values come from a run of A_x, at most x - 1 distinct input values can be decided.
Hence, D solves (Π^C, x - 1)-set agreement. □

7 We could have used a "black-box" simulation of A_x using (Π^C, x)-set agreement objects. To
make the paper self-contained, we give a direct construction using ¬Ω_x in Appendix C.1.

Therefore, in our framework, we obtain a direct generalization of the fact that for a failure detector,
it is as hard to solve consensus in a system of n processes as to solve consensus among
each pair of processes [12]. In fact, the separation between C-processes and S-processes implies
a stronger result: solving k-set agreement among one given set of (k + 1) processes is as hard (in
the failure detector sense) as solving it among all n processes.

4 Generalizing the puzzle 

We showed in the previous section that solving k-set agreement among any given set of k + 1 
C-processes requires an amount of information about failures that is sufficient to solve k-set 
agreement among all n C-processes. We show below that this statement can be extended to any 
task T that cannot be solved (k + 1)-concurrently. We present an explicit reduction algorithm
that extracts ¬Ω_k from any failure detector that solves T. Conversely, we show that a task that
is k-concurrently solvable can be solved with ¬Ω_k in any environment.

Finally, we derive a complete characterization of generic tasks: all tasks that can be solved
k-concurrently but not (k + 1)-concurrently are equivalent in the sense that they require the same
information about failures to be solved (¬Ω_k).

4.1 Reduction to ¬Ω_k

Let T be any task that cannot be solved (k + 1)-concurrently. Let E be any environment. We
show that every failure detector D that solves T in E can be used to implement ¬Ω_k in E as
follows.

Let A be the algorithm that solves T using D in E. Recall that A consists of two parts: A^C
is run by the C-processes p_1, ..., p_n and A^S is run by the S-processes q_1, ..., q_n.

First, we construct a restricted algorithm A_sim. In A_sim, C-processes p_1, ..., p_n perform
two parallel tasks. In the first task, C-processes take steps on behalf of A^C. In the second task,
they simulate a run of A^S on S-processes using, instead of D, a directed acyclic graph (DAG)
G. The DAG G contains a sample of values output by D in some run R of A [9, 28]. In A_sim,
S-processes take null steps.

Informally, each run of A_sim gives "turns" to the S-processes, and if G provides enough
information about failures to simulate the next step of an S-process q_j, the step of q_j appears
in the simulated run of A. To simulate steps of A^S, C-processes employ BG-simulation [5] [7].
This simulation technique enables k + 1 processes, called simulators, to simulate a run of any
asynchronous n-process protocol in which at least (n - k) processes take infinitely many steps.
Thus, if k or fewer participating C-processes take a finite number of steps, the resulting run of
A_sim gives infinitely many turns to at least n - k S-processes.

Let F be the failure pattern of the run in which G was constructed. A_sim guarantees that
(1) every finite run of A_sim simulates a finite run of A, (2) if every S-process that is correct
in F receives infinitely many turns to take steps, then the simulated run of A is fair, and (3) if k
or fewer participating C-processes take only finitely many steps, then at most k S-processes
receive only finitely many turns to take steps in the simulation.

Second, we construct a reduction algorithm. In such an algorithm, C-processes take null steps.
Our reduction algorithm consists of two components (both are run exclusively by the S-processes).
In the first component, every S-process q_i queries D, exchanges the returned values with other
S-processes and maintains a DAG G_i. In the second component, each q_i locally simulates multiple
(k + 1)-concurrent runs of A_sim using G_i, going over all combinations of inputs and exploring the
runs in a depth-first manner. The simulation continues as long as some simulated C-process
does not decide in the produced run of A_sim. Since T cannot be solved (k + 1)-concurrently,
there must be a (k + 1)-concurrent run of A_sim in which some participating C-process that takes
infinitely many steps never decides. The only reason for a C-process not to decide in a run of
A_sim is that some correct S-process receives only finitely many turns in the simulation. But in
the simulation, at least (n - k) S-processes receive infinitely many turns. Thus, by outputting
the identities of the (n - k) S-processes that were last to receive turns in the current run, we
emulate the output of ¬Ω_k: we output sets of n - k S-processes that eventually never contain
some correct process.

Theorem 8 Let T be a task that cannot be solved (k + 1)-concurrently. For every environment
E, for every failure detector D that solves T in E, ¬Ω_k is weaker than D in E.

4.2 Solving a k-concurrent task with ¬Ω_k

In this section, instead of ¬Ω_k, we use an equivalent failure detector, vector-Ω_k [28]. Basically,
vector-Ω_k gives a k-vector of processes such that, eventually, at least one position of the vector
stabilizes on the same correct process at all correct processes.
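The stabilization property can be illustrated on a finite trace. The sketch below is only an approximation under stated assumptions: the function name `stable_positions`, the `histories` layout, and the `suffix` window are hypothetical, and a finite window can suggest, but never establish, an "eventually" property.

```python
def stable_positions(histories, correct, k, suffix):
    """Return the positions (0-based) that look stable over the last `suffix` outputs.

    histories: dict mapping each process id to its list of k-vectors
               (tuples of process ids), one vector per failure-detector query.
    correct:   set of ids of correct processes.
    """
    stable = []
    for j in range(k):
        # Values observed at position j, over the window, at every correct process.
        seen = {vec[j] for p in correct for vec in histories[p][-suffix:]}
        # Stable if a single value is observed and it names a correct process.
        if len(seen) == 1 and next(iter(seen)) in correct:
            stable.append(j)
    return stable


histories = {
    "q1": [("q3", "q2"), ("q1", "q2"), ("q1", "q2")],
    "q2": [("q2", "q3"), ("q3", "q2"), ("q1", "q2")],
}
print(stable_positions(histories, correct={"q1", "q2"}, k=2, suffix=2))
```

In this toy trace only the second position agrees on a correct process over the last two queries; the definition requires at least one such position to exist from some point on.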

By definition, if T is k-concurrently solvable, then there exists a restricted algorithm A that
k-concurrently solves T.

First, we define an abstract simulation technique that, with the help of vector-Ω_k, allows us to
simulate, in a system of n C-processes, runs of any restricted input-less algorithm on k C-processes
(the set of non-⊥ input values is a singleton). Moreover, in this simulation, if ℓ simulators
participate, then at most min(k, ℓ) processes take infinitely many steps in the simulated execution.
Basically, to perform a step for a simulated C-process p_i, the C-processes and the S-processes
execute an instance of a leader-based consensus algorithm [10], using item i of vector-Ω_k as a
leader. The property of vector-Ω_k ensures that for some i, infinitely many consensus instances
terminate.

Second, we define a restricted algorithm B for k C-processes that simulates a k-concurrent
run of A, using the BG-simulation technique [5] [7]. Applying the abstract simulation technique
to B, we obtain an algorithm in which every run R simulates a run R_sim of A such that: (1) R_sim
contains only steps of participating processes of R, (2) the inputs of the participating processes
are the same in R and R_sim, (3) R_sim is k-concurrent, and (4) every C-process that takes infinitely
many steps in R also takes infinitely many steps in R_sim. So if T is k-concurrently solvable with
A, R_sim satisfies T, and, consequently, R satisfies T.

To sum up, we have constructed an algorithm that solves T with ¬Ω_k: with the help of
S-processes and ¬Ω_k, p_1, ..., p_n simulate C-processes p'_1, ..., p'_k that, in turn, simulate
C-processes p''_1, ..., p''_n taking steps in a k-concurrent execution of algorithm A.

Theorem 9 Let T be any k-concurrently solvable task. For every environment E, ¬Ω_k solves T
in E.

4.3 Task hierarchy 

From Theorems 8 and 9 we deduce:

Theorem 10 Let T be a task that can be solved k-concurrently but not (k + 1)-concurrently. In
every environment E, ¬Ω_k is the weakest failure detector to solve T in E.

As a corollary, all tasks that can be solved k-concurrently but not (k + 1)-concurrently (e.g., k-set
agreement) are equivalent in the sense that they require exactly the same amount of information
about failures (captured by ¬Ω_k).

5 Characterizing the task of strong renaming



To illustrate the utility of our framework, we consider the task of (j, ℓ)-renaming [3]. The task
is defined on n (n > j) processes and assumes that in every run at most j processes participate
(at least n - j elements of each input vector I are ⊥). As an output, every participant obtains a
unique name in the range {1, ..., ℓ} (every non-⊥ element in each output vector O is a distinct
value in {1, ..., ℓ}).

In this section, we first focus on (j, j)-renaming (also called strong j-renaming). Using
Theorem 10, we show that the weakest failure detector for strong j-renaming is Ω (for each
1 < j < n). In other words, strong renaming is equivalent to consensus.

Note that in strong 2-renaming at most 2 C-processes concurrently execute steps of the
algorithm. So the impossibility of achieving strong 2-renaming is equivalent to the impossibility of
solving strong 2-renaming 2-concurrently. By a simple reduction to the impossibility of wait-free
2-process consensus, we show (Appendix D):

Lemma 11 Strong 2-renaming cannot be solved 2-concurrently. 

By reducing to the impossibility of Lemma 11, we get a more general result:

Theorem 12 For all 1 < j < n, strong j-renaming cannot be solved 2-concurrently. 

Proposition 1, Theorem 10 and Theorem 12 imply:

Corollary 13 For all j (1 < j < n), in every environment E, Ω is the weakest failure detector
for solving strong j-renaming in E.

In fact, there exists a generic algorithm (Appendix D.2) that, for all k = 1, ..., j, solves
(j, j + k - 1)-renaming in all k-concurrent runs, and thus (j, j + k - 1)-renaming can be solved
using ¬Ω_k. For some values of k and j, (j, j + k - 1)-renaming can be shown to be impossible to
solve (k + 1)-concurrently; for others, determining the maximal level of concurrency of
(j, j + k - 1)-renaming is still an open question [8].

6 Conclusion 

This paper introduces a new model of distributed computing with failure detectors that allows 
processes to cooperate. A process in this model is able to advance the computation of other 
participating processes in the way used previously only in asynchronous simulations [To! [TB] , 
while using failure detectors to overcome asynchronous impossibilities. In our new framework, 
we derive a complete characterization of distributed tasks, based on their maximal "concurrency
level": class k (k = 1, ..., n) consists of tasks that can be solved at most k-concurrently, and all
tasks in the class are equivalent to k-set agreement.

Our framework does not have to be tied to wait-freedom. We can think of its generalization 
to any progress condition on computation processes encapsulated, e.g., in an adversary [13]. 
Therefore, we can pose questions of the kind: what is the weakest failure detector to solve a task
T in the presence of an adversary A? This gives another dimension to the questions explored in
this paper.

A Proof for 1-concurrent solvability (Section 2.2)



Proposition 1 Every task is 1-concurrently solvable.

Proof. Each C-process p_i executes the following code: (1) it writes its input, (2) it reads the
other inputs already written, getting a vector I such that I[i] = ⊥, and (3) it reads all the other
outputs already written, getting a vector O. If O is composed only of ⊥, then p_i is the first
process, and it chooses an output according to its input and Δ. Otherwise, let I' be obtained from
I by replacing the i-th item with the input value of p_i. By the definition of tasks, if (I, O) ∈ Δ,
there exists a vector O' obtained from O by replacing the i-th item by a non-⊥ value such that
(I', O') ∈ Δ. Then p_i decides and outputs the value O'[i]. Let R be a 1-concurrent run. By an easy
induction on the number of participating processes, we prove that R satisfies T. □
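The construction in the proof can be sketched as a sequential loop, since in a 1-concurrent run each process runs alone until it decides. This is an illustrative sketch under assumptions: the task relation Δ is modeled by a hypothetical predicate `delta` over partial vectors (`None` stands for ⊥), outputs are searched in a finite list `candidates`, and the toy task used here is consensus.

```python
def solve_one_concurrently(arrivals, n, delta, candidates):
    """Run processes one at a time; each one extends the output vector.

    arrivals:   list of (process index, input value) in arrival order.
    delta:      predicate delta(I, O) on partial vectors (None stands for ⊥).
    candidates: list of possible output values.
    """
    I = [None] * n   # inputs written so far
    O = [None] * n   # outputs written so far
    for i, value in arrivals:
        I[i] = value                         # write the input
        for v in candidates:                 # look for a valid extension O'
            trial = O[:i] + [v] + O[i + 1:]
            if delta(I, trial):
                O[i] = v                     # decide O'[i]
                break
        else:
            raise ValueError("no valid extension: delta is not a task relation")
    return O


def consensus_delta(I, O):
    """All outputs agree, and every output is the input of some participant."""
    outs = {o for o in O if o is not None}
    ins = {x for x in I if x is not None}
    return len(outs) <= 1 and outs <= ins


result = solve_one_concurrently([(2, "b"), (0, "a"), (1, "c")], 3,
                                consensus_delta, ["a", "b", "c"])
print(result)
```

The first process to arrive fixes the decision with its own input, and every later process finds an extension that agrees with it, which mirrors the induction on the number of participants.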



B Proof for the reduction to ¬Ω_k (Section 4.1)

The algorithm sketched in Figure 1 describes the steps to be taken by S-processes q_1, ..., q_n to
emulate ¬Ω_k. First we describe the asynchronous algorithm A_sim used by the C-processes to
simulate runs of A, given a sample of the output of D. Then we describe how the S-processes use
multiple simulated runs of A_sim to emulate the output of ¬Ω_k.

Asynchronous simulation of A. Following the technique of Chandra et al. [9], we represent
a sample of the failure-detector output in the form of a directed acyclic graph (DAG). The DAG
is constructed by the S-processes by periodically querying D and collecting the output values:
every vertex of the DAG has the form [q_i, d, k], which conveys that the k-th query of D performed
by process q_i returned value d. An edge between vertices [q_i, d, k] and [q_j, d', k'] conveys that the
k-th query of D performed by q_i causally precedes [23] the k'-th query of D performed by q_j.
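A minimal sketch of such a DAG is given below. It assumes, for illustration only, that a process merges the DAGs it receives and then adds an edge from every vertex it already knows to each new vertex of its own; the class `SampleDAG` and its method names are hypothetical.

```python
class SampleDAG:
    """Local DAG of failure-detector samples kept by one S-process."""

    def __init__(self, owner):
        self.owner = owner
        self.queries = 0
        self.succ = {}   # vertex -> set of causally later vertices

    def merge(self, other):
        """Incorporate a DAG received from another S-process."""
        for vertex, later in other.succ.items():
            self.succ.setdefault(vertex, set()).update(later)

    def add_sample(self, value):
        """Record the next query result; all known vertices precede it."""
        self.queries += 1
        vertex = (self.owner, value, self.queries)
        for later in self.succ.values():
            later.add(vertex)
        self.succ[vertex] = set()
        return vertex

    def precedes(self, u, v):
        return v in self.succ.get(u, set())


g1 = SampleDAG("q1")
a = g1.add_sample("d")
g2 = SampleDAG("q2")
g2.merge(g1)                # q2 learns about q1's first query
b = g2.add_sample("d'")
print(g2.precedes(a, b), g2.precedes(b, a))
```

Because every known vertex gets an edge to the new one, the edge relation stays transitive and acyclic without any extra bookkeeping.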

As in [28] [18], any such DAG G can be used to construct a restricted algorithm A_sim.

In A_sim, the C-processes p_1, ..., p_n simulate runs of A. The C-processes obtain input values
for T and perform two parallel tasks. First, the C-processes take steps on behalf of A^C. Second,
they use BG-simulation [5] [7] to simulate a run of A^S on q_1, ..., q_n. But to simulate a step of
an S-process, instead of D they use the information provided by G. More precisely, in the
simulation, every S-process q_i takes steps as prescribed by A^S, except that when q_i is about to
query D, it chooses the next vertex [q_i, d, k] causally succeeding the latest simulated steps of A^S
of all S-processes seen by q_i so far. If G was constructed in a run of A with failure pattern F, it is
guaranteed that (1) every finite run simulated by A_sim is a run of A with failure pattern F, and
(2) if the run of A_sim contains infinitely many simulated steps of processes in correct(F), then
the simulated run is a fair run of A^S with failure pattern F [28] [18].

A^S does not have inputs. Therefore, the simulation tries to promote all n S-processes (but
succeeds in taking a step for an S-process q_i only if there is enough information for q_i in G).

If the simulated run of A generates an output value for p_i, p_i outputs this value and leaves
the computation. Note that since T cannot be solved (k + 1)-concurrently, and all runs of A are
safe, there must be a (k + 1)-concurrent (simulated) run of A in which some participating process
takes infinitely many steps without outputting a value.

Extracting ¬Ω_k. Now, to derive ¬Ω_k, each S-process q_i collects the output of D
in G_i and simulates locally multiple (k + 1)-concurrent runs of A_sim. The runs are simulated in
the corridor-based depth-first manner [TH] that works as follows. 

We assume a total order on the subsets P ⊆ Π^C so that if P ⊂ P' then P precedes P' in
the order. Each initial state I and each schedule σ, a sequence specifying the order in which
p_1, ..., p_n take steps of A_sim, determine a unique run of A_sim simulated at process q_i, denoted
α_i(I, σ).

 1  for all I_0, input vectors of T (in some order) do
        /* All possible inputs for p_1, ..., p_n */
 2    for all π_0, permutations of p_1, ..., p_n (in some order) do
        /* All possible "arrival orders" */
 3      P_0 := the set of first k + 1 C-processes in π_0
 4      explore(I_0, ⊥, P_0, π_0)

 5  function explore(I, σ, P, π)
 6    ¬Ω_k-output_i := n - k S-processes that appear the latest in α_i(I, σ)
                       (any n - k S-processes if not possible)
 7    if ∃ q_j ∈ Π^S: ∀ σ' ∈ dom(α_i), ∃ σ'', a prefix of σ': α_j(I, σ'') is deciding then
        /* If all schedules explored so far were found deciding by q_j */
 8      adopt q_j's simulation
 9    else
10      N := the set of undecided processes in α_i(I, σ)
11      for all p_j ∈ P - N do                /* For each decided process in P */
12        P := P - {p_j}
13        P := P ∪ {the first process in π that does not appear in σ}
          /* Replace p_j with the next non-participant in π */
14      for all P' ⊆ P (in some order consistent with ⊆) do
          /* For all "sub-corridors" */
15        for all p_j ∈ P' (in π) do
16          explore(I, σ · p_j, P', π)

Figure 1: Deriving ¬Ω_k: code for each S-process

Consider a given input vector I and a given permutation π of p_1, ..., p_n that describes the
order in which the C-processes "arrive" at the computation. Initially, we select a set P of the first
k + 1 processes in π as the participating set. Subsets P' ⊆ P are then explored as "corridors"
(line 16) in the deterministic order, from the narrowest (solo) corridors to wider and wider ones.
Recursively, we go through simulating all runs in which only C-processes in P' take steps. In the
course of simulation, if a participating C-process p_j decides, we replace it with a process that has
not yet taken steps in the current computation (line 13). Since we only replace a decided process
with a "fresh" non-participant, the participating set keeps the size of k + 1 or fewer processes.
This procedure is repeated until every C-process decides. Thus, every simulated run is
(k + 1)-concurrent. Once the exploration of the current corridor is complete (the call of explore
in line 16 returns), we proceed to the next corridor, etc.
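The corridor order can be sketched as follows. Ordering subsets by size is one total order in which every set precedes all of its proper supersets; the function name `corridors` and the choice of size-then-lexicographic order are assumptions made for illustration, since the text only requires consistency with inclusion.

```python
from itertools import combinations


def corridors(participants):
    """Yield the non-empty subsets of `participants`, narrowest first."""
    members = sorted(participants)
    for size in range(1, len(members) + 1):
        for subset in combinations(members, size):
            yield frozenset(subset)


order = list(corridors({"p1", "p2", "p3"}))
print([sorted(c) for c in order])
```

With three participants, the three solo corridors come first, then the three pairs, and finally the full set.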

If, at some point, q_i finds out that another S-process q_j made more progress in the simulation
(simulated more runs than q_i), then q_i "adopts" the simulation of q_j (line 8) by adopting q_j's
version of the DAG and the map α_j [18].

The output of ¬Ω_k is evaluated as the set of the ids of the latest n - k processes in q_1, ..., q_n
that appear in the currently simulated run of A_sim (line 6).
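The evaluation of the output can be sketched as below. The sketch assumes that "latest" refers to the most recent last appearance of each S-process in the simulated run, and that processes that never appeared are used as fillers when fewer than n - k have appeared; the function name `anti_omega_output` is hypothetical.

```python
def anti_omega_output(simulated_run, s_processes, k):
    """Pick the n - k S-processes whose last simulated step is the most recent.

    simulated_run: sequence of S-process ids, in the order of their simulated steps.
    s_processes:   list of all n S-process ids.
    """
    n = len(s_processes)
    last_seen = {q: -1 for q in s_processes}   # -1 means "never appeared"
    for position, q in enumerate(simulated_run):
        last_seen[q] = position
    # Most recently seen first; the sort is stable, so ties keep list order.
    ranked = sorted(s_processes, key=lambda q: -last_seen[q])
    return set(ranked[: n - k])


run = ["q1", "q2", "q3", "q2", "q4", "q2"]
print(anti_omega_output(run, ["q1", "q2", "q3", "q4"], k=1))
```

Here q1 took its last simulated step before all the others, so it is the one process left out of the output.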

Recall that T cannot be solved (k + 1)-concurrently, and thus there must exist a
(k + 1)-concurrent run of A_sim in which some participating live process never decides. Since the
only reason for the run of A_sim not to decide is the absence of some correct process in the
simulated k-resilient run of A_sim, the emulated output eventually never contains some correct
process, so ¬Ω_k is emulated. Thus:

Theorem 8 Let T be a task that cannot be solved (k + 1)-concurrently. For every environment
E, for every failure detector D that solves T in E, ¬Ω_k is weaker than D in E.

Proof sketch. Our reduction algorithm works as follows. Every S-process q_i runs two parallel
tasks. First, it periodically queries its module of D and maintains its directed acyclic graph G_i,
as in [9] [18]. Second, it uses G_i to locally simulate multiple runs of A_sim and emulates the output
of ¬Ω_k. Consider any run of the reduction algorithm. Let F be the failure pattern of that run.

First we observe that every simulated run of A_sim is (k + 1)-concurrent. Indeed, initially,
exactly (k + 1) C-processes participate and a new participant joins only after some participating
C-process decides and departs.

Then we show that the correct S-processes eventually perform the same infinite sequence of
recursive invocations of explore: explore(I, ⊥, P_0, π) invokes explore(I, σ_1, P_1, π), which in turn
invokes explore(I, σ_2, P_2, π), etc. (line 14). Indeed, all S-processes perform the simulations in the
same order, and since the task is not (k + 1)-concurrently solvable, there must be a never-deciding
(k + 1)-concurrent run of A_sim. Since all these P_ℓ are non-empty, there exist ℓ* and P* such
that for all ℓ > ℓ*, P_ℓ = P*. Since we proceed from narrower corridors to wider ones, P* is the set of
live C-processes that never decide in the "first" never-deciding (k + 1)-concurrent simulated run
with a schedule σ*.

Now we observe that all simulated runs eventually always extend a prefix σ' of σ* in which
some simulated processes not in P* already took all their steps in σ*. Moreover, there is a time
after which all explored extensions of σ' only contain steps of processes in P*. By the properties
of BG-simulation [5] [7], every S-process that appears only finitely often in the run of A_sim
simulated by σ* (we call these processes blocked by σ*) eventually never appears in any simulated
run of A. Let U be the set of S-processes blocked by σ*. Since the run of A_sim simulated by σ*
is (k + 1)-concurrent, processes in U eventually never appear among the last n - k processes in
α_i(I, σ) (line 6).

Now we observe that U must contain a correct (in F) S-process. If this is not the case, i.e., U
does not contain a correct S-process, then the simulated run of A is fair and thus must be
deciding, which contradicts the choice of σ*.

Thus, eventually some correct S-process never appears in ¬Ω_k-output_i at every correct
S-process q_i, and ¬Ω_k is emulated. □



C Proof for solving a k-concurrently solvable task with ¬Ω_k (Section 4.2)

This section presents a distributed algorithm that uses ¬Ω_k to solve, in any environment, any task
that can be solved k-concurrently. The result could have been obtained from the simulation of
k-concurrency using (black-box) k-set agreement objects [18]. But for the sake of self-containment,
we present a (simpler) direct construction of a k-concurrent run using ¬Ω_k (see footnote 8).

First we describe an abstract simulation technique that uses vector-Ω_k (equivalent to ¬Ω_k [28]) to
simulate, in a system of n C-processes, a run of an arbitrary asynchronous algorithm B on k
C-processes.

Then we apply this technique to show that, in every environment, we can use vector-Ω_k to simulate
a run R_sim of any given protocol A on n C-processes. If R is the current run, we have the following
properties: (1) R_sim only contains steps of participating processes of R, (2) R_sim is k-concurrent,
and (3) every participating C-process of R that takes infinitely many steps is given enough steps
in R_sim to decide.

C.1 Simulating k codes using ¬Ω_k

Suppose we are given a read-write algorithm B on k C-processes, p'_1, ..., p'_k. Assuming that
vector-Ω_k is available, the algorithm in Figure 2 describes how n simulators, C-processes
p_1, ..., p_n, can simulate an infinite run of B.

8 The construction is similar to the one presented in [18] for the actively k-resilient case.

The simulation is similar in spirit to BG-simulation [5] [7]. Every simulator p_i first registers
its participation in the shared memory and then tries to advance simulated C-processes
p'_1, ..., p'_min(k,m), where m is the number of simulators that p_i has witnessed participating.

To simulate a step of p'_j, simulators agree on the view of the C-process after performing the
step. However, instead of the BG-agreement protocol of [5] [7], we use here a leader-based
consensus algorithm [9]. In the algorithm, a process periodically (in every round r of computation)
queries the current leader to get an estimate of the decision.

Since in our algorithm both C-processes and S-processes can be elected leaders, we modify
the algorithm of [9] as follows. When a process wants to get an estimate of the decision (say in
round r), it publishes a query (query, est', r) in the shared memory (proposing its current estimate
est'), waits until the current leader publishes a response (est, r), and adopts the estimate. For
simplicity, we assume that every process (be it a C-process or an S-process) periodically scans the
memory to find new queries of the kind (query, est', r) and responds to them by publishing one
of the proposed estimates. Furthermore, we assume that each S-process periodically updates the
shared array Ω_k-S[1, ..., k] with the output of its module of vector-Ω_k. Recall that eventually
some position Ω_k-S[j] (j ∈ {1, ..., k}) stabilizes on the identity of some correct S-process.

The resulting algorithm terminates under the condition that all C-processes eventually agree
on the same correct leader. The instance of the consensus algorithm used to simulate the ℓ-th step
of C-process p'_j is denoted by cons_{j,ℓ}.

The rule to elect the leader is the following. As long as the number of participating
simulators is k or less, the participating simulator with the j-th smallest identity acts as a leader
for simulating steps of p'_j. When the number of participating simulators exceeds k, the leader for
simulating steps of p'_j is given by Ω_k-S[j].

In both cases, at least one simulated C-process is eventually associated with the same correct 
leader. Thus, at least one simulated C-process makes progress in the simulation. 
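The election rule can be sketched as a pure function of the current participating set and the shared array. The function name `elect_leaders` and the use of `None` for positions that the rule leaves unassigned are assumptions made for illustration.

```python
def elect_leaders(pars, k, omega_k_s):
    """Return the leaders for simulated processes p'_1, ..., p'_k.

    pars:      set of ids of participating simulators.
    omega_k_s: list of k process ids, the current content of the shared array.
    When at most k simulators participate, positions beyond len(pars)
    are not assigned by the rule and are reported as None.
    """
    if len(pars) <= k:
        ranked = sorted(pars)                      # j-th smallest leads p'_j
        return ranked + [None] * (k - len(ranked))
    return list(omega_k_s)                         # otherwise follow the detector


print(elect_leaders({3, 1}, 3, ["q1", "q2", "q3"]))
print(elect_leaders({1, 2, 3, 4}, 3, ["q1", "q2", "q3"]))
```

With few simulators, each participant leads a distinct simulated process, so a single correct participant suffices for progress; with more than k simulators, progress relies on the position of the array that stabilizes on a correct S-process.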

The algorithm also assumes that a simulator p_i may decide to leave the simulation if the
simulated run produced a desired output (line 28). We use this option in the next section.

Theorem 14 In every environment, the protocol in Figure 2 simulates an infinite run of any
k-process algorithm B (as long as there is at least one not decided participating simulated process).
Moreover, if ℓ simulators participate, i.e., |pars| = ℓ, then at most min(k, ℓ) processes participate
in the simulated run.

Proof. Consider an infinite run of the algorithm. Since every next state of each simulated process
p'_j is decided using a consensus algorithm, every simulator observes exactly the same evolution of
states for every simulated process. Thus, the simulated schedule indeed belongs to a run of B.

Now consider the construction of variables Leader_1, ..., Leader_k used by the consensus
algorithms cons_{1,ℓ}, ..., cons_{k,ℓ} (lines 31-36). Let ℓ be the number of participating simulators.

If ℓ ≤ k, the simulator with the j-th smallest identity in pars is assigned to be the leader
of exactly one simulated process p'_j. Since at least one simulator is correct, there exists p'_j
(j = 1, ..., |pars|) such that all instances cons_{j,ℓ} using Leader_j eventually terminate. Thus, p'_j
accepts infinitely many steps in the simulated run.

If ℓ > k, at least one Leader_j (j = 1, ..., k) eventually stabilizes on some correct process
identity, as guaranteed by the properties of vector-Ω_k. Again, p'_j takes infinitely many steps in
the simulated run.

In both cases, at most min(ℓ, k) simulated processes appear in the produced run of B, and at
least one simulated process takes infinitely many steps. □

Shared variables:
    R_j, j = 1, ..., n, initially ⊥
    V_j, j = 1, ..., k, initially the initial state of p'_j
    Ω_k-S[j], j = 1, ..., k, initially q_1
Local variables:
    Leader_j, j = 1, ..., k, initially p_1
    ℓ_j, j = 1, ..., k, initially 1
    v_j, j = 1, ..., k, initially ⊥

Task 1:
17  R_i := 1
18  undecided := true
19  for j = 1, ..., k do v_j := {V_1, ..., V_k}
20  while undecided do
21    for j = 1, ..., min(|pars|, k) do
22      perform one more step of cons_{j,ℓ_j}(v_j) using Leader_j as a leader
23      if cons_{j,ℓ_j}(v_j) returns v then      { The next state of p'_j is decided }
24        V_j := v                              { Adopt the decided state of p'_j }
25        simulate the next step of p'_j in B
26        if v allows p_i to decide then        { The simulator can depart }
27          undecided := false
28          R_i := ⊥
29        v_j := {V_1, ..., V_k}                { Evaluate the next state of p'_j }
30        ℓ_j := ℓ_j + 1

Task 2:
31  while true do
32    pars := {p_j : R_j ≠ ⊥}
33    if |pars| ≤ k then
34      for j = 1, ..., |pars| do Leader_j := the j-th smallest process in pars
35    else
36      for j = 1, ..., k do Leader_j := Ω_k-S[j]

Figure 2: Simulating k codes using vector-Ω_k: the program code for simulator p_i

C.2 Solving a k-concurrent task with ¬Ω_k



Theorem 9 Let T be any k-concurrently solvable task. In every environment E, ¬Ω_k solves T
in E.

Proof. Let A be the algorithm that solves T k-concurrently. We simply employ the simulation
protocol in Figure 2 (Theorem 14), and suppose that the simulated algorithm B is Extended
BG-simulation [15] for A. More precisely, B simulates with k C-processes the algorithm A with
n C-processes.

Thus, the double simulation is built as follows. Every process p_i writes its input value of T
to the shared memory and starts the simulation of k processes p'_1, ..., p'_k using the algorithm in
Figure 2. The simulated processes p'_1, ..., p'_k run, in turn, BG-simulation of A on n processes
p''_1, ..., p''_n.

Each simulated process p''_j is simulated only if the corresponding p_j has written its input of T
in the shared memory and p''_j has not yet obtained an output in the simulated run. Moreover, to
make sure that the simulation indeed produces a k-concurrent run, at any point of the simulation,
each simulator p'_j ∈ {p'_1, ..., p'_k} tries to advance the participating and not yet decided process
with the smallest id. If the currently simulated process is found blocked [5] [7], i.e., the process
cannot advance because another simulator started simulating a step of it but has not yet finished,
p'_j proceeds to the next smallest undecided participating process in {p''_1, ..., p''_n}. Since there are
at most k simulators, at most k - 1 undecided participating processes can be found blocked, and
thus there are at most k undecided participating processes at a time: the resulting simulated run
is k-concurrent.

When p''_j obtains an output, the corresponding simulator p_j considers itself "decided" (line 26),
writes ⊥ in its register R_j (line 28) and departs.

If p'_j cannot make progress because each code it tries to simulate is blocked and there are no
more codes to add, it "aborts" all blocked agreements [15] and resumes the simulation. Since, at
each point of time, the number of simulated codes does not get below the number of simulators
that take steps, the simulation keeps making progress.

Thus, as long as ℓ processes {p_{j_1}, ..., p_{j_ℓ}} participate, only min(k, ℓ) processes in
{p'_1, ..., p'_k} take steps, which results in a k-concurrent simulated run of A. Every process p''_j
that takes steps eventually decides in a k-concurrent run of A and the corresponding simulator p_j
departs. As soon as the decided process p_i departs by writing ⊥ to R_i, we have one simulator p_i
and one simulated process p''_i less. Therefore, as long as there is a simulator taking steps and the
run is fair, the simulated run makes progress, i.e., more and more participants decide. Thus, we
obtain an algorithm that, in every environment, solves T. □



D Proof for characterizing the task of renaming (Section 5)

To illustrate the utility of our framework, we consider the task of (j, ℓ)-renaming [3]. The task
is defined on n (n > j) processes and assumes that in every run at most j processes participate
(at least n - j elements of each input vector I are ⊥). As an output, every participant obtains a
unique name in the range {1, ..., ℓ} (every non-⊥ element in each output vector O is a distinct
value in {1, ..., ℓ}).

We show first that (j, j)-renaming (also called strong j-renaming) is not 2-concurrently
solvable. Then we present a generic algorithm that, for all k = 1, ..., j, solves
(j, j + k - 1)-renaming in all k-concurrent runs, and thus (j, j + k - 1)-renaming can be solved
(in IFD) using ¬Ω_k.

D.1 Impossibility of 2-concurrent strong 2-renaming

Lemma 11 Strong 2-renaming cannot be solved 2-concurrently.

Shared variables: R_ℓ, ℓ = 1, ..., n, initially ⊥

37  R_i := 1                             /* register participation */
38  repeat
39    S := {p_ℓ | R_ℓ ≠ ⊥}               /* get the current participating set */
40    S' := {p_ℓ | R_ℓ = 1}              /* get the set of not yet decided participants */
41    min_1 := min(S')
42    if (|S'| = 1) then min_2 := min_1 else min_2 := min(S' - {min(S')})
43    if (|S| = j and (p_i = min_1 or p_i = min_2)) or (|S| = j - 1 and p_i = min_1) then
44      take one more step of A          /* if among two not decided with smallest ids */
45  until decided
46  R_i := …
47  return the name decided in A

Figure 3: A 1-resilient strong j-renaming algorithm: code for each C-process p_i.

Proof. We start by showing that for the special case of j = 2, strong renaming cannot be
solved 2-concurrently. Suppose, by contradiction, that there exists a (restricted) algorithm A
that solves (2, 2)-renaming 2-concurrently. Since we assumed j < n, we have at least 3 processes
in the system. By the pigeon-hole principle, there exist two processes that decide on the same
name v ∈ {1, 2} in their solo runs of A. Without loss of generality, let these processes be p_1 and
p_2 and let v be 1.

Now p_1 and p_2 can wait-free solve 2-process consensus as follows. Each process publishes
its input and then runs A until it obtains a name. If the name is 1, the process decides on its
input, otherwise it decides on the input of the other process. Since a process in {p_1, p_2} obtains
1 as a name in a solo run of A, if 1 is not obtained, then the other process participates in the run
of A and, thus, has previously written its input. Therefore, every decided value was previously
proposed. Since the obtained names are distinct, the two processes cannot decide on different
values. This concludes the proof that strong 2-renaming cannot be solved 2-concurrently. □
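The decision rule of this reduction can be sketched on a toy sequential schedule. The class `ToyRenaming` is a hypothetical stand-in for the algorithm A assumed by contradiction (the lemma shows that no such wait-free algorithm exists for concurrent schedules); it simply hands out name 1 to the first caller and name 2 to the second.

```python
class ToyRenaming:
    """Hypothetical oracle: name 1 for the first caller, name 2 for the second."""

    def __init__(self):
        self.next_name = 1

    def get_name(self):
        name = self.next_name
        self.next_name += 1
        return name


def propose(me, value, published, renaming):
    """Publish the input, acquire a name, and decide as in the proof."""
    published[me] = value
    name = renaming.get_name()
    if name == 1:
        return value                          # decide on own input
    other = next(p for p in published if p != me)
    return published[other]                   # decide on the other's input


published, renaming = {}, ToyRenaming()
d1 = propose("p1", "x", published, renaming)
d2 = propose("p2", "y", published, renaming)
print(d1, d2)
```

The process that obtains name 2 knows that the other process has already published its input, so reading it is safe; both processes end up deciding the input of the holder of name 1.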



Theorem 12 For all 1 < j < n, strong j-renaming cannot be solved 2-concurrently.

Proof. By Lemma 11, we already have the result for j = 2. Suppose, by contradiction, that for
some 2 < j < n, there exists a (restricted) algorithm A solving strong j-renaming 2-concurrently.
As we deal here with 2-concurrent solvability, we are only interested in the C-processes and their
algorithms. We use A to solve strong j-renaming in all 1-resilient runs, i.e., runs in which at least
j - 1 C-processes participate and take infinitely many steps. Recall that at most j C-processes
participate in every run, so either j - 1 or j processes take infinitely many steps. In the algorithm
(Figure 3), every process registers its participation (line 37) and then periodically checks the
current set of participants (line 39). If it finds out that it is among the 2 processes with the
smallest identities among the participating but not yet decided processes (line 43), then it starts
taking steps of A until the algorithm provides p_i with a new name. Then p_i declares that it has
decided (line 46) and departs.

Note that the resulting run of A is 2-concurrent: either the participating set is of size j - 1
and only the not yet decided participant with the smallest identity is allowed to take steps of A
solo, or exactly j processes participate and the two not yet decided processes with the smallest
identities are allowed to take steps concurrently.
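The gating test that enforces 2-concurrency can be sketched as a predicate over the shared registers. The encoding is an assumption made for illustration: `None` stands for ⊥, `1` marks an undecided participant, and any other value marks a decided one; the function name `may_take_step` is hypothetical.

```python
def may_take_step(i, registers, j):
    """Return True if process i is currently allowed to take a step of A.

    registers: dict id -> None (absent), 1 (participating, undecided),
               or any other value (participating, decided).
    j:         maximal number of participants.
    """
    participants = [p for p, r in registers.items() if r is not None]
    undecided = sorted(p for p, r in registers.items() if r == 1)
    if not undecided:
        return False
    min1 = undecided[0]
    min2 = undecided[1] if len(undecided) > 1 else min1
    if len(participants) == j:
        return i in (min1, min2)     # two smallest undecided ids may run
    if len(participants) == j - 1:
        return i == min1             # only the smallest undecided id may run
    return False


full = {1: 1, 2: 1, 3: 1}
print(may_take_step(1, full, 3), may_take_step(3, full, 3))
```

At most two processes pass the test at any time, which is exactly why the induced run of A is 2-concurrent.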

Now we observe that the run of A continues as long as there is at least one not yet decided 
participant that takes steps. Indeed, either the participating set is of size j - 1 and every participant 
takes an infinite number of steps (including the not yet decided one with the smallest identity), 
or exactly j C-processes participate and at least one of the two not yet decided processes with the 
smallest identities takes an infinite number of steps. Thus, every C-process that keeps taking 
Shared variables: 
    R_ℓ, ℓ = 1, ..., n, initially ⊥ 

48  s := 1 
49  repeat forever 
50      R_i := (i, s, true)                          /* register new name */ 
51      S := {R_ℓ | R_ℓ ≠ ⊥}                         /* collect suggested names */ 
52      if ∃(ℓ, s_ℓ, b) ∈ S: i ≠ ℓ and s = s_ℓ then 
53          r := the rank of i in {ℓ | (ℓ, s_ℓ, b) ∈ S, b = true} 
                                                     /* rank among not yet decided participants */ 
54          s := the r-th integer not in {s_ℓ | (ℓ, s_ℓ, b) ∈ S, ℓ ≠ i} 
                                                     /* suggest a new name among not yet suggested */ 
55      else 
56          R_i := (i, s, false) 
57          return s 

Figure 4: A k-concurrent (j, j + k - 1)-renaming algorithm: code for each process pi. 

steps of A in the resulting 2-concurrent run eventually decides and departs. The set of undecided 
participants gets smaller by one, and the next C-process with the smallest identity joins the 
2-concurrent run of A. 

But it is shown in [15] that if all 1-resilient runs of a restricted algorithm A satisfy strong 
j-renaming, then there is a restricted algorithm that solves strong 2-renaming 2-concurrently, a 
contradiction with Lemma [TT] □ 



D.2 Solving renaming 

The distributed algorithm used to solve (j, j + k - 1)-renaming k-concurrently essentially mimics 
the algorithm of [3 Hj for wait-free (j, 2j - 1)-renaming. 

Theorem 15 For all 1 < k < j < m, (j, j + k - 1)-renaming can be solved k-concurrently. 

Proof. Our algorithm, described in Figure 4, essentially mimics the algorithm of [3 S] for 
wait-free (j, 2j - 1)-renaming. 

In the algorithm, every process periodically selects a new name according to the set of 
names not yet suggested by other processes and its rank among the set of currently not yet decided 
participants (lines 53 and 54). 

Note that since at most j processes participate in every run, pi can observe at most j - 1 
names suggested by other processes in line 51. Furthermore, since in a k-concurrent run pi can 
observe at most k not yet decided participants, its rank can be at most k. Therefore, the highest 
name pi can suggest in line 50 is j + k - 1. 

Now we show that no two processes output the same name. Suppose, by contradiction, that 
pi and pj output the same name s. Thus, both pi and pj previously suggested s in line 50. But 
since both processes read each other's registers after that, at least one of them would 
see that s has been suggested by another process and thus would not decide, a contradiction. 
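
The bound and the uniqueness argument above can be checked on small instances with a simulation. The following is a minimal sketch, not the authors' code: it transcribes lines 48-57 of Figure 4 into a Python generator and drives it with a randomized scheduler. The names `process`, `run`, and `nth_free`, the use of an atomic snapshot for the collect in line 51, and the admission policy (a new process starts as soon as fewer than k processes are undecided) are all assumptions made for the illustration.

```python
import random


def nth_free(r, taken):
    """Return the r-th positive integer that is not in `taken`."""
    found, candidate = 0, 0
    while found < r:
        candidate += 1
        if candidate not in taken:
            found += 1
    return candidate


def process(i, R):
    """One process of Figure 4; each `yield` ends one shared-memory step."""
    s = 1                                                  # line 48
    while True:                                            # line 49
        R[i] = (i, s, True)                                # line 50
        yield
        S = [e for e in R.values() if e is not None]       # line 51 (snapshot assumed)
        yield
        if any(q != i and sq == s for q, sq, _ in S):      # line 52
            undecided = sorted(q for q, _, b in S if b)
            r = undecided.index(i) + 1                     # line 53
            s = nth_free(r, {sq for q, sq, _ in S if q != i})  # line 54
        else:
            R[i] = (i, s, False)                           # line 56
            yield
            return s                                       # line 57


def run(ids, k, seed, max_steps=100_000):
    """Simulate one k-concurrent run; return a dict identity -> name."""
    rng = random.Random(seed)
    R = {i: None for i in ids}
    waiting = list(ids)
    rng.shuffle(waiting)
    active, names = {}, {}
    for _ in range(max_steps):
        # Admit processes while fewer than k are undecided (k-concurrency).
        while waiting and len(active) < k:
            i = waiting.pop()
            active[i] = process(i, R)
        if not active:
            return names
        i = rng.choice(sorted(active))
        try:
            next(active[i])
        except StopIteration as done:
            names[i] = done.value
            del active[i]
    raise RuntimeError("step budget exhausted before every process decided")


ids = [3, 7, 11, 20, 42]            # j = 5 participants
names = run(ids, k=2, seed=0)
print(names)                        # distinct names in 1..j + k - 1 = 6
```

In this model a process counts as undecided until it returns, so at most k registers hold a `True` flag at any time, which is what keeps the rank in line 53 at most k.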

Finally, we show that every correct process eventually decides. Consider, by contradiction, 
a run R in which a set of correct processes {p_{j_1}, ..., p_{j_t}} (ordered by their ids) never decide. 
We call these processes trying. We establish a contradiction by showing that p_{j_1} must eventually 
decide. Indeed, consider R', a suffix of R in which only trying processes take steps, and let S 
be the set of names suggested by the processes not in {p_{j_1}, ..., p_{j_t}} (note that this set does not 
change in R'). Since p_{j_1} has the smallest rank among the trying processes (let us denote it by r), 
eventually no other trying process will ever suggest the r-th name not in S. Thus, p_{j_1} eventually finds 
itself to be the only process to suggest the name and decides, a contradiction. 

From this result and Theorem 5, we can conclude: 

Theorem 16 For all 1 < k < j < m, (j, j + k - 1)-renaming can be solved with 